Bispectral Pedestrian Detection Augmented with Saliency Maps using Transformer

Mohamed Amine Marnissi ¹ Ikram Hattab ¹ Hajer Fradi ¹ Anis Sahbani ² Najoua Essoukri Ben Amara ¹

¹ LATIS-Laboratory of Advanced Technology and Intelligent Systems, 4023, Sousse, Tunisia
² Enova Robotics, 4023, Sousse, Tunisia


Figure: Architecture of the proposed fusion scheme of visible and thermal images, augmented by saliency maps, for pedestrian detection.

Highlights

Automatic Pedestrian Detection for Surveillance Applications: This paper addresses automatic pedestrian detection in surveillance scenarios. The focus is on real-time detection from both visible and thermal cameras, exploiting the complementary information they provide.
Fusion Network and Visual Saliency Transformation: The proposed approach introduces a fusion network that combines features from the visible and thermal inputs and augments them through a visual saliency transformation. The fusion network is integrated into the YOLO-v3 architecture, and the resulting detection model is trained in a paired setting.
Superior Results and Low Computational Cost: Experiments on the KAIST multi-spectral dataset show that the proposed fusion framework outperforms both single-input detection and other fusion schemes. The approach also has a very low computational cost, which makes it suitable for real-time applications. Additional tests on a security robot support this.
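
The idea of combining visible and thermal inputs with a saliency map can be illustrated with a toy example. The sketch below is not the fusion network from the paper, which fuses features inside YOLO-v3. It is a minimal input-level stand-in that assumes grayscale images stored as nested lists with values in [0, 1], computes a deliberately simple saliency map from the thermal image (absolute deviation from the mean intensity), and stacks the three values per pixel. The names `saliency_map` and `fuse`, and the choice of saliency measure, are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image: rows of intensities in [0, 1]


def saliency_map(thermal: Image) -> Image:
    """Toy saliency: absolute deviation from the mean intensity, scaled to [0, 1].

    Assumption: this is only a placeholder for the visual saliency
    transformation; it is not the method used in the paper.
    """
    pixels = [p for row in thermal for p in row]
    mean = sum(pixels) / len(pixels)
    deviation = [[abs(p - mean) for p in row] for row in thermal]
    peak = max(max(row) for row in deviation)
    if peak == 0:  # flat image: nothing stands out
        return [[0.0 for _ in row] for row in thermal]
    return [[d / peak for d in row] for row in deviation]


def fuse(visible: Image, thermal: Image) -> List[List[List[float]]]:
    """Stack visible, thermal, and saliency values into a 3-channel pixel grid."""
    same_rows = len(visible) == len(thermal)
    if not same_rows or any(len(v) != len(t) for v, t in zip(visible, thermal)):
        raise ValueError("visible and thermal images must have the same shape")
    saliency = saliency_map(thermal)
    return [
        [[v, t, s] for v, t, s in zip(v_row, t_row, s_row)]
        for v_row, t_row, s_row in zip(visible, thermal, saliency)
    ]


# A 2x2 scene in which one pixel is much warmer than the rest.
visible = [[0.2, 0.2], [0.3, 0.2]]
thermal = [[0.1, 0.1], [0.9, 0.1]]
fused = fuse(visible, thermal)
```

In this toy scene the warm pixel receives the highest saliency value, so the third channel marks where a detector might look first. A real pipeline would operate on image tensors and learn the fusion rather than stacking raw channels.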

Materials

Abstract

In this paper, we focus on the problem of automatic pedestrian detection for surveillance applications. In particular, the main goal is to perform real-time detection from both visible and thermal cameras, exploiting their complementary aspects. To this end, we propose a fusion network that uses features from both inputs and performs augmentation by means of a visual saliency transformation. This fusion process is incorporated into YOLO-v3 as the base architecture. The resulting detection model is trained in a paired setting in order to improve on the results obtained from each single input. To demonstrate the effectiveness of the proposed fusion framework, several experiments are conducted on the KAIST multi-spectral dataset. The results show that the proposed framework outperforms both single-input detection and other fusion schemes. The proposed approach also has the advantage of a very low computational cost, which is important for real-time applications. To demonstrate this, additional tests on a security robot are presented as well.
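
Training in a paired setting requires each visible frame to be matched with the thermal frame captured at the same moment. The following sketch shows one way such pairs might be assembled, assuming the two modalities are stored under separate folders and that paired frames share a file name. The folder names `visible` and `lwir`, the example file names, and the helper `pair_frames` are illustrative assumptions, not details taken from the paper.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


def pair_frames(visible_paths: List[str], thermal_paths: List[str]) -> List[Tuple[str, str]]:
    """Match visible and thermal frames that share a file stem.

    Frames without a counterpart in the other modality are dropped.
    """
    thermal_by_stem: Dict[str, str] = {PurePosixPath(p).stem: p for p in thermal_paths}
    pairs = []
    for visible in sorted(visible_paths):
        stem = PurePosixPath(visible).stem
        if stem in thermal_by_stem:
            pairs.append((visible, thermal_by_stem[stem]))
    return pairs


# Hypothetical file listing: the third visible frame has no thermal counterpart.
visible_paths = ["visible/I00001.jpg", "visible/I00000.jpg", "visible/I00002.jpg"]
thermal_paths = ["lwir/I00000.jpg", "lwir/I00001.jpg"]
pairs = pair_frames(visible_paths, thermal_paths)
```

Dropping unmatched frames keeps every training sample bispectral, which is what a paired detector expects at its input.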

Results

1. Deep Map

2. Qualitative detection

Citation

@inproceedings{DBLP:conf/visapp/MarnissiHFSA22,
  author    = {Mohamed Amine Marnissi and Ikram Hattab and Hajer Fradi and Anis Sahbani and Najoua Essoukri Ben Amara},
  title     = {Bispectral Pedestrian Detection Augmented with Saliency Maps using Transformer},
  booktitle = {Proceedings of the 17th International Joint Conference on Computer
               Vision, Imaging and Computer Graphics Theory and Applications, {VISIGRAPP}},
  year      = {2022}}

Contact

If you have any questions, please contact Mohamed Amine Marnissi at mohamed.amine.marnissi@gmail.com.